DML
Commands. Hive Data Manipulation Language commands are used for
inserting, retrieving, modifying, deleting, and updating data in the
Hive table. In this article, we will learn Hive DML commands. There are
many Hive DML commands like LOAD, INSERT, UPDATE, etc. We will explore
each of these DML commands individually, along with their syntax and
examples.
The various Hive DML commands are:
- LOAD
- SELECT
- INSERT
- DELETE
- UPDATE
- EXPORT
- IMPORT
LOAD Command
The LOAD statement in Hive is used to move data files into the locations corresponding to Hive tables.
The LOAD statement in Hive is used to move data files into the locations corresponding to Hive tables.
- If a LOCAL keyword is specified, then the LOAD command will look for the file path in the local filesystem.
- If the LOCAL keyword is not specified, then the Hive will need the absolute URI of the file.
- In case the keyword OVERWRITE is specified, then the contents of the target table/partition will be deleted and replaced by the files referred by filepath.
- If the OVERWRITE keyword is not specified, then the files referred by filepath will be appended to the table.
Syntax:
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)];
Here we are trying to load data from the ‘dab’ file in the local filesystem to the ‘emp_data’ table.
![LOAD statement Hive DML Command](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/LOAD-statement-Hive-DML-Command1.png)
![SELECT_statment_to_view_data_loaded_in_hive_table Hive DML Command](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/SELECT_statment_to_view_data_loaded_in_hive_table-Hive-DML-Command2.png)
SELECT Command
The SELECT statement in Hive is similar to the SELECT statement in SQL used for retrieving data from the database.
The SELECT statement in Hive is similar to the SELECT statement in SQL used for retrieving data from the database.
Syntax:
SELECT col1,col2 FROM tablename;
![SELECT_statement - Hive-DML-Command](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/SELECT_statementHive-DML-Command3.png)
INSERT Command
The INSERT command in Hive loads the data into a Hive table. We can do insert to both the Hive table or partition.
INSERT INTO
The INSERT INTO statement appends the data into existing data in the table or partition. INSERT INTO statement works from Hive version 0.8.
Syntax:
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;
Syntax:
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;
Here in this example, we are trying to insert the data of ‘emp_data’ table created above into the table ‘example’.
![https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/Creating_table_example-Hive-DML-Command4.png](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/Creating_table_example-Hive-DML-Command4.png)
INSERT statement to load data into table “example”.
![INSERT_INTO_statement - Hive-DML-Command](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/INSERT_INTO_statement-Hive-DML-Command5.png)
![Displaying_result_of_insert_into - Hive DML Commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/Displaying_result_of_insert_into-Hive-DML-Command6.png)
INSERT OVERWRITE
The INSERT OVERWRITE table overwrites the existing data in the table or partition.
Syntax:
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, ..) [IF NOT EXISTS]] select_statement FROM from_statement;
The INSERT OVERWRITE table overwrites the existing data in the table or partition.
Syntax:
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, ..) [IF NOT EXISTS]] select_statement FROM from_statement;
Here we are overwriting the existing data of the table ‘example’ with the data of table ‘dummy’ using INSERT OVERWRITE statement.
![INSERT_OVERWRITE_statement - Hive DML Commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/INSERT_OVERWRITE_statement-Hive-DML-Command7.png)
![Executing_insert_overwrite - Hive DML Commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/Executing_insert_overwrite-Hive-DML-Command8.png)
By using the SELECT statement we can verify whether the existing data of
the table ‘example’ is overwritten by the data of table ‘dummy’ or not.
![displaying_result_of_insert_overwrite - Hive DML Command](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/displaying_result_of_insert_overwrite-Hive-DML-Command9.png)
INSERT .. VALUES
INSERT ..VALUES statement in Hive inserts data into the table directly from SQL. It is available from Hive 0.14.
Syntax:
INSERT INTO TABLE tablename [PARTITION (partcol1[=val1], partcol2[=val2] ...)] VALUES values_row [, values_row ...];
Example:
Inserting data into the ‘student’ table using INSERT ..VALUES statement.
INSERT ..VALUES statement in Hive inserts data into the table directly from SQL. It is available from Hive 0.14.
Syntax:
INSERT INTO TABLE tablename [PARTITION (partcol1[=val1], partcol2[=val2] ...)] VALUES values_row [, values_row ...];
Example:
Inserting data into the ‘student’ table using INSERT ..VALUES statement.
![https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/INSERT_VALUES_statement-Hive-DML-Command10.png](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/INSERT_VALUES_statement-Hive-DML-Command10.png)
![displaying_result_of_insert_values_statement - Hive DML commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/displaying_result_of_insert_values_statement-Hive-DML-Command11.png)
DELETE command
The DELETE statement in Hive deletes the table data. If the WHERE clause is specified, then it deletes the rows that satisfy the condition in where clause.
The DELETE statement can only be used on the hive tables that support ACID.
Syntax:
DELETE FROM tablename [WHERE expression];
Example:
In the below example, we are deleting the data of the student from table ‘student’ whose roll_no is 105.
The DELETE statement in Hive deletes the table data. If the WHERE clause is specified, then it deletes the rows that satisfy the condition in where clause.
The DELETE statement can only be used on the hive tables that support ACID.
Syntax:
DELETE FROM tablename [WHERE expression];
Example:
In the below example, we are deleting the data of the student from table ‘student’ whose roll_no is 105.
![DELETE_statement - Hive DML Commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/DELETE_statement-Hive-DML-Command12.png)
![executing_DELETE_statement - Hive DML Commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/executing_DELETE_statement-Hive-DML-Command13.png)
By using the SELECT statement we can verify whether the data of the
student from table ‘student’ whose roll_no is 105 is deleted or not.
![displaying_result_after_delete - Hive DML Commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/displaying_result_after_delete-Hive-DML-Command14.png)
UPDATE Command
The update can be performed on the hive tables that support ACID.
The UPDATE statement in Hive deletes the table data. If the WHERE clause is specified, then it updates the column of the rows that satisfy the condition in WHERE clause.
Partitioning and Bucketing columns cannot be updated.
Syntax:
UPDATE tablename SET column = value [, column = value ...] [WHERE expression];
Example:
In this example, we are updating the branch of the student whose roll_no is 103 in the ‘student’ table using an UPDATE statement.
The update can be performed on the hive tables that support ACID.
The UPDATE statement in Hive deletes the table data. If the WHERE clause is specified, then it updates the column of the rows that satisfy the condition in WHERE clause.
Partitioning and Bucketing columns cannot be updated.
Syntax:
UPDATE tablename SET column = value [, column = value ...] [WHERE expression];
Example:
In this example, we are updating the branch of the student whose roll_no is 103 in the ‘student’ table using an UPDATE statement.
![UPDATE-statement - Hive DML Commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/UPDATE-statement-Hive-DML-Command15.png)
![executing_update_statement - Hive DML Commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/executing_update_statement-Hive-DML-Command16.png)
By using the SELECT statement we can verify whether the branch of the
student whose roll_no is 103 in the ‘student’ table is updated or not.
![displaying_result_after_update - Hive DML Commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/displaying_result_after_update-Hive-DML-Command17.png)
EXPORT Command
The Hive EXPORT statement exports the table or partition data along with the metadata to the specified output location in the HDFS.
Metadata is exported in a _metadata file, and data is exported in a subdirectory ‘data.’
Syntax:
EXPORT TABLE tablename [PARTITION (part_column="value"[, ...])] TO 'export_target_path' [ FOR replication('eventid') ];
The Hive EXPORT statement exports the table or partition data along with the metadata to the specified output location in the HDFS.
Metadata is exported in a _metadata file, and data is exported in a subdirectory ‘data.’
Syntax:
EXPORT TABLE tablename [PARTITION (part_column="value"[, ...])] TO 'export_target_path' [ FOR replication('eventid') ];
Here in this example, we are exporting the student table to the HDFS directory “export_from_hive”.
![EXPORT-statement - Hive DML Commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/EXPORT-statement-Hive-DML-Command18.png)
![executing_export_statement](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/executing_export_statement-Hive-DML-Command19.png)
The table successfully exported. You can check for the _metadata file and data sub-directory using ls command.
![displaying_exported_table_directory_Hive DML Commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/displaying_exported_table_directory_Hive-DML-Command20.png)
IMPORT Command
The Hive IMPORT command imports the data from a specified location to a new table or already existing table.
Syntax:
IMPORT [[EXTERNAL] TABLE new_or_original_tablename [PARTITION (part_column="value"[, ...])]]
FROM 'source_path' [LOCATION 'import_target_path'];
The Hive IMPORT command imports the data from a specified location to a new table or already existing table.
Syntax:
IMPORT [[EXTERNAL] TABLE new_or_original_tablename [PARTITION (part_column="value"[, ...])]]
FROM 'source_path' [LOCATION 'import_target_path'];
Here in this example, we are importing the data exported in the above example into a new Hive table ‘imported_table’.
![IMPORT-statement-Hive-DML-Commands](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/IMPORT-statement-Hive-DML-Command21.png)
![https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/executing_import_statement-Hive-DML-Command22.png](https://d2h0cx97tjks2p.cloudfront.net/blogs/wp-content/uploads/sites/2/2020/03/executing_import_statement-Hive-DML-Command22.png)
No comments:
Post a Comment